Formant trajectories for acoustic-to-articulatory inversion

Authors

I. Yücel Özbek

Mark Hasegawa-Johnson

Mübeccel Demirekler

Abstract

This work examines the utility of formant frequencies and their energies in acoustic-to-articulatory inversion. Formant frequencies and formant spectral amplitudes are automatically estimated from audio and treated as observations for estimating electromagnetic articulography (EMA) coil positions. A Gaussian mixture regression model with mel-frequency cepstral coefficient (MFCC) observations is modified by using formants and energies to either replace or augment the MFCC observation vector. The augmented observation yields 3.4% lower RMS error and a 2.7% higher correlation coefficient than the baseline MFCC observation. The improvement is largest for plosive consonants, possibly because formant tracking provides information about the acoustic resonances that would otherwise be unavailable during plosive closure and release.
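As a rough illustration of the regression step described in the abstract, the sketch below computes the minimum mean-square-error estimate E[y | x] under a joint Gaussian mixture over acoustic features x and articulatory positions y. It is not the authors' implementation: the function names (`gmr_predict`, `augment`), the feature dimensions, and the assumption that mixture parameters have already been trained (for example by EM, not shown) are all hypothetical choices made for this example.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log density of a multivariate normal evaluated at x."""
    d = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = d @ np.linalg.solve(cov, d)
    return -0.5 * (len(x) * np.log(2 * np.pi) + logdet + maha)

def gmr_predict(x, weights, means, covs):
    """Estimate E[y | x] under a joint GMM over z = [x, y].

    x: (dx,) acoustic observation
    weights: (K,), means: (K, dx+dy), covs: (K, dx+dy, dx+dy)
    Parameters are assumed to be pre-trained; training is not shown.
    """
    dx = x.shape[0]
    log_resp = np.empty(len(weights))
    cond_means = []
    for k, (w, mu, S) in enumerate(zip(weights, means, covs)):
        mu_x, mu_y = mu[:dx], mu[dx:]
        S_xx, S_yx = S[:dx, :dx], S[dx:, :dx]
        # Unnormalized log posterior of component k given x.
        log_resp[k] = np.log(w) + gaussian_logpdf(x, mu_x, S_xx)
        # Per-component linear regression of y on x.
        cond_means.append(mu_y + S_yx @ np.linalg.solve(S_xx, x - mu_x))
    # Normalize responsibilities in a numerically stable way.
    resp = np.exp(log_resp - log_resp.max())
    resp /= resp.sum()
    return resp @ np.array(cond_means)

def augment(mfcc, formants, energies):
    """Build the augmented observation: MFCCs plus formants and energies."""
    return np.concatenate([mfcc, formants, energies])

# Toy usage with hypothetical sizes: 13 MFCCs, 4 formants, 4 energies.
obs = augment(np.zeros(13), np.zeros(4), np.zeros(4))
```

Replacing the MFCC vector, rather than augmenting it, would correspond to passing only the formant and energy features as x; the regression itself is unchanged.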


Similar resources

Introduction of constraints in an acoustic-to-articulatory inversion method based on a hypercubic articulatory table

Our acoustic to articulatory inversion method exploits an original articulatory table structured in the form of a hypercube hierarchy. The articulatory space is decomposed into regions where the articulatory-to-acoustic mapping is linear. Each region is represented by a hypercube. The inversion procedure retrieves articulatory vectors corresponding to an acoustic entry from the hypercube table....


Unsupervised vocal-tract length estimation through model-based acoustic-to-articulatory inversion

Knowledge of vocal-tract (VT) length is a logical prerequisite for acoustic-to-articulatory inversion. Prior work has treated VT length estimation (VTLE) and inversion largely as separate problems. We describe a new algorithm for VTLE based on acoustic-to-articulatory inversion. Our inversion process uses the Maeda model (MM, [1,2]) and combines global search [3] and dynamic programming for tra...


Modeling the articulatory space using a hypercube codebook for acoustic-to-articulatory inversion.

Acoustic-to-articulatory inversion is a difficult problem mainly because of the nonlinearity between the articulatory and acoustic spaces and the nonuniqueness of this relationship. To resolve this problem, we have developed an inversion method that provides a complete description of the possible solutions without excessive constraints and retrieves realistic temporal dynamics of the vocal trac...


Vocal tract inversion by cepstral analysis-by-synthesis using chain matrices

Acoustic-to-articulatory inversion for vowels is performed by cepstral analysis-by-synthesis, using chain-matrix calculation of vocal tract (VT) acoustics and the Maeda articulatory model. The derivative of the VT chain matrix with respect to the area function was calculated in a novel efficient manner, and used in the BFGS quasi-Newton method for optimizing a distance measure between input and...


Speech recognition using non-linear trajectories in a formant-based articulatory layer of a multiple-level segmental HMM

This paper describes how non-linear formant trajectories, based on ‘trajectory HMM’ proposed by Tokuda et al., can be exploited under the framework of multiple-level segmental HMMs. In the resultant model, named a non-linear/linear multiple-level segmental HMM, speech dynamics are modeled as non-linear smooth trajectories in the formant-based intermediate layer. These formant trajectories are m...



Publication date: 2009
